Abstract

A critical variable of a satisfiable CNF formula is a variable that has the same value
in all satisfying assignments. Using a simple case distinction on the fraction of critical
variables of a CNF formula, we improve the running time for 3-SAT from $O(1.32216^n)$
by Rolf [9] to $O(1.32153^n)$. Using a different approach, Iwama et al. [4] very recently
achieved a running time of $O(1.32113^n)$. Our method nicely combines with
theirs, yielding the currently fastest known algorithm with running time $O(1.32065^n)$.
We also improve the bound for 4-SAT from $O(1.47390^n)$ [5] to $O(1.46928^n)$, where
$O(1.46981^n)$ can be obtained using the methods of [5] and [9].

1 Introduction 

The ideas behind the most successful algorithms for $k$-SAT are surprisingly simple. In
1999, Paturi, Pudlák, and Zane [8] proposed the following algorithm. Given a $k$-CNF
formula $F$, we choose a variable $x$ uniformly at random from the $n$ variables in $F$, choose
a truth value $b \in \{0,1\}$, and set $x$ to $b$, thereby replacing $F$ by $F^{[x \mapsto b]}$, and continue
with the resulting formula. The value $b$ is chosen as follows: If the formula contains the
unit clause $(x)$, we choose $b = 1$. If it contains $(\bar{x})$, we choose $b = 0$. In these two cases,
we say $x$ was forced. If it contains neither, we choose $b$ randomly and say $x$ was guessed.
Finally, if the formula contains both $(x)$ and $(\bar{x})$, we can give up, since the formula is
unsatisfiable. This algorithm is usually called PPZ after its three inventors.

Intuitively, if $F$ is "strongly constrained", then the algorithm encounters many unit
clauses, hence it needs to guess significantly fewer than $n$ variables. On the other hand,
if $F$ is only "weakly constrained", it has multiple satisfying assignments, making it easier
to find one. Paturi, Pudlák and Zane [8] make this intuition precise and show that PPZ
finds a satisfying assignment for a $k$-CNF formula with probability at least $2^{-(1-1/k)n}$,
provided there exists one.

A couple of years later, Paturi, Pudlák, Saks, and Zane [7] came up with a simple but
powerful idea. In a preprocessing step, they apply a restricted version of resolution. This
increases the number of unit clauses the algorithm encounters and therefore increases its
success probability. This gives an algorithm called PPSZ. If $F$ has a unique satisfying
assignment, its success probability is quite good (for 3-SAT, it is $\Omega(1.308^{-n})$), and the
analysis is highly elegant. The case of multiple satisfying assignments appears to be much
more difficult and has been the subject of several papers so far. Iwama and Tamaki [5]
made a major step forward when they observed that while the success probability of
PPSZ deteriorates as the number of satisfying assignments increases, that of Schöning's
random walk algorithm [10] improves. They quantified this tradeoff and obtained an
algorithm with a success probability of $\Omega(1.32373^{-n})$¹. We denote this combined algorithm,
consisting of one run of PPSZ and one run of Schöning's random walk algorithm, by
Comb.

The PPSZ paper. There are two versions of [7], which we call the old version and
the new version. For unique $k$-SAT, both are the same, but for general $k$-SAT, the old
version of [7] gives a more complicated analysis. The old version gives a better bound for
3-SAT and the new version gives a better bound for 4-SAT.

Only the new version is published, but the old version is still available at the Citeseer
cache². However, we have found some minor errors in that version. There is also a
conference version [6] stating the results of the old version of [7], but without most proofs.
Rolf [9] improved the analysis of the old version to get a bound of $O(1.32216^n)$. However, [9]
does not consider 4-SAT. We use the ideas of [9] for our improvement of 4-SAT. In Timon
Hertli's master's thesis [2], the old version of [7] with the result of [9] is presented in a
self-contained way. We will reference that thesis for detailed proofs.

1.1 Our Contribution 

Let $F$ be a satisfiable CNF formula over $n$ variables and $x$ be a variable therein. We call
$x$ critical if all satisfying assignments of $F$ agree on $x$. Equivalently, $x$ is critical if exactly
one of the formulas $F^{[x \mapsto 0]}$ and $F^{[x \mapsto 1]}$ is satisfiable. We denote by $c(F)$ the fraction of
critical variables, i.e., the number of critical variables divided by $n$; if $n = 0$, we define

\[ c(F) := 1. \]

Our contribution consists of two statements: Theorem 1 shows that for our purposes
we only need to consider formulas with many critical variables. Point 3 of Lemma 9 then
implies that the success probability of PPSZ increases if $F$ has many critical variables.
This is obtained by slightly modifying the existing analysis of [7] and [9] by taking critical
variables into account. However, Lemma 9 is somewhat technical and we need to embed
it into a review of the existing analysis. Theorem 1 is very simple, so we state it here:

Theorem 1. Let $p, q, c^* \in [0,1]$ and $a, b \ge 1$ such that $q/b = (1 - c^*/2) =: r$. Suppose
algorithm $A$ runs in time $a^n 2^{o(n)}$ and for every satisfiable $(\le k)$-CNF formula $F$ with
$c(F) \ge c^*$ finds a satisfying assignment with probability at least $p^n (1/2)^{o(n)}$. Then there
exists an algorithm $A'$ that runs in time $\max\{a, b\}^n 2^{o(n)}$ and for every satisfiable
$(\le k)$-CNF formula finds a satisfying assignment with probability at least $\min\{p, q\}^n (1/2)^{o(n)}$.

Obviously we can turn $A'$ into an algorithm that finds a satisfying assignment in
expected time $(\max\{a,b\}/\min\{p,q\})^n 2^{o(n)}$.

Proof. By guessing $j$ variables we mean fixing in $F$ $j$ variables chosen uniformly at random
to values chosen uniformly at random, obtaining the formula $F'$ over at most $n - j$
variables. $A'$ for each $j \in \{0, \dots, n\}$ repeats the following $b^j$ times: Guess $j$ variables
and then run $A$ on $F'$; the running time bound is trivial. To bound the probability,
we first claim that there exists a $j$ such that $a_j \ge r^j/(n+1)$, where $a_j$ is the probability that
after guessing $j$ variables $F'$ is satisfiable and $c(F') \ge c^*$. Suppose this is not the case:
Let $b_j$ be the probability that after guessing $j$ variables $F'$ is satisfiable and $c(F') < c^*$.
Clearly $a_0 + b_0 = 1$ since $F$ is satisfiable, and $a_{i+1} + b_{i+1} \ge b_i \cdot r$, as guessing one
variable of a satisfiable formula $F'$ with $c(F') < c^*$ preserves satisfiability with probability
at least $(1 - c^*/2) = r$. By the assumption,
$b_i \cdot r > (a_i + b_i - r^i/(n+1)) \cdot r$; from this it is easy to show that
$a_n + b_n > r^n - n r^n/(n+1) = r^n/(n+1)$.
If $j = n$, we have $c(F') = 1$ by definition; hence $b_n = 0$ and $a_n > r^n/(n+1)$, a contradiction.
Now let $j^*$ be the $j$ given by the claim; we repeat $b^{j^*}$ times an algorithm that has success
probability at least $a_{j^*} p^{n-j^*} (1/2)^{o(n)}$; as $r \cdot b = q$ this gives by a routine argument an
algorithm with success probability at least $p^{n-j^*} q^{j^*} (1/2)^{o(n)}$. □

¹ Using the new version of [7] immediately gives the bound $\Omega(1.32267^{-n})$, as stated in [9].
² http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1134
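The definition of critical variables and of $c(F)$ can be checked by brute force on tiny formulas. The following is a minimal sketch, not part of the algorithms of this paper: the clause encoding (tuples of signed integers) and the function names are assumptions made for illustration, and the enumeration takes exponential time.

```python
from itertools import product

def satisfying_assignments(clauses, n):
    """Enumerate all satisfying assignments of a CNF formula by brute force.

    Variables are 1..n; a literal is +v or -v; a clause is a tuple of literals.
    """
    result = []
    for bits in product((0, 1), repeat=n):
        # a literal is satisfied if the variable's bit matches the literal's sign
        if all(any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in c)
               for c in clauses):
            result.append(bits)
    return result

def critical_fraction(clauses, n):
    """Return (set of critical variables, c(F)) for a satisfiable formula."""
    sols = satisfying_assignments(clauses, n)
    if not sols:
        raise ValueError("formula is unsatisfiable")
    if n == 0:
        return set(), 1.0  # convention c(F) := 1 when n = 0
    # a variable is critical if all satisfying assignments agree on it
    critical = {v for v in range(1, n + 1)
                if len({s[v - 1] for s in sols}) == 1}
    return critical, len(critical) / n

# Example: (x1) and (x2 or x3). Only x1 is critical.
F = [(1,), (2, 3)]
crit, frac = critical_fraction(F, 3)
```

In the example, one of three variables is critical, so guessing a uniformly random variable with a uniformly random value keeps the formula satisfiable with probability at least $1 - c(F)/2$, which is the quantity $r$ used in the proof of Theorem 1.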

We improve the analysis for PPSZ for formulas with many critical variables. In
combination with Theorem 1, this gives a success probability of $\Omega(1.32153^{-n})$ for
3-SAT and $\Omega(1.46928^{-n})$ for 4-SAT. Very recently, Iwama, Seto, Takai, and Tamaki [4]
showed how to combine an improved version of Schöning's algorithm [3, 1] with PPSZ
and achieved expected running time of $O(1.32113^n)$. We combine our improvement with
theirs to obtain a bound of $O(1.32065^n)$. In the main part, we show a bound $O(1.321^n)$
that still improves on the bound of [4]. In the appendix we prove the better bound. The
only change is that we use a better result of [4] which has different parameters; however, these
are not stated explicitly, so we need to derive and prove them.

We analyze the algorithm Comb($F$), where $F$ is a CNF formula. Comb consists
essentially of a call to PPSZ [7] and to Schoening [10]. In [5] it was shown that Comb
has a better success probability than what the separate analyses of PPSZ and Schoening give.
Let ISTT be the algorithm of [4] that improves Comb.

Theorem 2. There exists an algorithm that for every satisfiable 3-CNF formula finds a
satisfying assignment with probability $\Omega(1.32153^{-n})$ and runs in subexponential time.

Theorem 3. There exists an algorithm that for every satisfiable 3-CNF formula finds a
satisfying assignment with expected running time $O(1.32065^n)$.

The previous theorem is proved in the appendix. We prove the following weaker
theorem in the main section:

Theorem 4. There exists an algorithm that for every satisfiable 3-CNF formula finds a
satisfying assignment with expected running time $O(1.321^n)$.

Theorem 5. There exists an algorithm that for every satisfiable 4-CNF formula finds a
satisfying assignment with probability $\Omega(1.46928^{-n})$ and runs in subexponential time.

This is already very close to unique 4-SAT, which has a success probability of
$\Omega(1.46899^{-n})$. The benefit of Theorem 1 is that when proving Theorems 2 and 5,
we only need to consider formulas with many critical variables. For example, to prove
Theorem 2, we choose $c^*$ such that $1 - c^*/2 = 1/1.32153$, i.e., $c^* \approx 0.4866$. Then we
have to bound from below the success probability of Comb for 3-CNF formulas $F$ with
$c(F) \ge c^*$.
1.2 Notation



We use the notational framework introduced in [11]. We assume an infinite supply of
propositional variables. A literal $u$ is a variable $x$ or a complemented variable $\bar{x}$. A finite
set $C$ of literals over pairwise distinct variables is called a clause and a finite set of clauses
is a formula in CNF (Conjunctive Normal Form). We say that a variable $x$ occurs in a
clause $C$ if either $x$ or $\bar{x}$ is contained in it and that $x$ occurs in the formula $F$ if there is
any clause where it occurs. We write $\mathrm{vbl}(C)$ or $\mathrm{vbl}(F)$ to denote the set of variables that
occur in $C$ or in $F$, respectively. A clause containing exactly one literal is called a unit
clause. We say that $F$ is a $(\le k)$-CNF formula if every clause has size at most $k$. Let
such an $F$ be given and write $V := \mathrm{vbl}(F)$ and $n := |V|$.

An assignment is a function $\alpha : V \to \{0, 1\}$ which assigns a Boolean value to each
variable. A literal $u = x$ (or $u = \bar{x}$) is satisfied by $\alpha$ if $\alpha(x) = 1$ (or $\alpha(x) = 0$). A clause
is satisfied by $\alpha$ if it contains a satisfied literal and a formula is satisfied by $\alpha$ if all of
its clauses are. A formula is satisfiable if there exists a satisfying truth assignment to its
variables.

For an assignment $\alpha$ on $V$ and a set $W \subseteq V$, we denote by $\alpha \oplus W$ the assignment
that corresponds to $\alpha$ on variables of $V \setminus W$ and is flipped on variables of $W$.

Given a CNF formula $F$, we denote by $\mathrm{sat}(F)$ the set of assignments that satisfy $F$.

Formulas can be manipulated by permanently assigning values to variables. If $F$ is a
given CNF formula and $x \in \mathrm{vbl}(F)$ then assigning $x \mapsto 1$ satisfies all clauses containing $x$
(irrespective of what values the other variables in those clauses are possibly assigned later)
whilst it truncates all clauses containing $\bar{x}$ to their remaining literals.

We will write $F^{[x \mapsto 1]}$ (and analogously $F^{[x \mapsto 0]}$) to denote the formula arising from
doing just this.

We say that two clauses $C_1$ and $C_2$ conflict on a variable $x$ if one of them contains
$x$ and the other $\bar{x}$. We call $C_1$ and $C_2$ a resolvable pair if they conflict in exactly one
variable $x$, and we define their resolvent by $R(C_1, C_2) := (C_1 \cup C_2) \setminus \{x, \bar{x}\}$. It is easy to
see that if $F$ contains a resolvable pair $C_1, C_2$, then $\mathrm{sat}(F) = \mathrm{sat}(F \cup \{R(C_1, C_2)\})$. A
resolvable pair $C_1, C_2$ is $s$-bounded if $|C_1| \le s$, $|C_2| \le s$, and $|R(C_1, C_2)| \le s$.

By Resolve($F$, $s$), we denote the set of clauses $C$ that have an $s$-bounded resolution
deduction from $F$. By a straightforward algorithm, we can compute Resolve($F$, $s$) in
time $O(n^{3s}\,\mathrm{poly}(n))$ [7].
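The resolvent and the closure under $s$-bounded resolution can be sketched as follows. This is a naive illustration under assumed conventions (clauses as frozensets of signed integers, hypothetical function names `resolvent` and `resolve`); it is not the procedure of [7] and makes no attempt to match its running time.

```python
from itertools import combinations

def resolvent(c1, c2):
    """Return R(C1, C2) if the clauses conflict on exactly one variable, else None.

    Clauses are frozensets of nonzero ints; -v is the complement of v.
    """
    conflicts = [l for l in c1 if -l in c2]
    if len(conflicts) != 1:
        return None                      # not a resolvable pair
    x = conflicts[0]
    return frozenset((c1 | c2) - {x, -x})

def resolve(formula, s):
    """Naive closure of a formula under s-bounded resolution."""
    clauses = set(formula)
    changed = True
    while changed:
        changed = False
        small = [c for c in clauses if len(c) <= s]   # only clauses of size <= s take part
        for c1, c2 in combinations(small, 2):
            r = resolvent(c1, c2)
            if r is not None and len(r) <= s and r not in clauses:
                clauses.add(r)
                changed = True
    return clauses

# Example: {x1, not x2} and {x2, x3} resolve on x2 to {x1, x3}.
F = {frozenset({1, -2}), frozenset({2, 3})}
G = resolve(F, 2)
```

The loop terminates because only finitely many clauses exist over a finite set of literals; every added clause is implied by the formula, so the set of satisfying assignments is unchanged.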

By choosing an element u.a.r. from a finite set, we mean choosing it uniformly at
random. By choosing an element u.a.r. from a closed real interval, we mean choosing
it according to the continuous uniform distribution over this interval. Unless otherwise
stated, all random choices are mutually independent.

We denote by log the logarithm to the base 2. For the logarithm to the base $e$, we
write ln. We define $0 \log 0 := 0$.

2 Proof of the Main Theorems 

In the following let $k \ge 3$ be a fixed integer. Let $F$ be a satisfiable $(\le k)$-CNF formula,
$V := \mathrm{vbl}(F)$ and $n := |V|$. We first give the concepts from [7] needed to understand
Lemma 9. Then we state the lemma and use it to improve the bounds on the success
probability of Comb and ISTT given sufficiently many critical variables. In Section 3, we
prove Lemma 9 and also consider 4-SAT. Most concepts used in the proof are from [7, 9].
Our contribution is to exploit what these concepts yield for critical variables.
Algorithm 1 PPSZ(CNF formula $F$, assignment $\beta$, permutation $\pi$)
  Let $\alpha$ be a partial assignment over $\mathrm{vbl}(F)$, initially the empty assignment.
  $G \leftarrow$ Resolve($F$, $\log(|\mathrm{vbl}(F)|)$)
  for all $x \in \mathrm{vbl}(G)$, according to $\pi$ do
    if $\{x\} \in G$ then
      $\alpha(x) \leftarrow 1$
    else if $\{\bar{x}\} \in G$ then
      $\alpha(x) \leftarrow 0$
    else
      $\alpha(x) \leftarrow \beta(x)$
    end if
    $G \leftarrow G^{[x \mapsto \alpha(x)]}$
  end for
  return $\alpha$

Algorithm 2 PPSZ(CNF formula $F$)
  {this algorithm is used for 4-SAT}
  Choose $\beta$ u.a.r. from all assignments on $\mathrm{vbl}(F)$
  Choose $\pi$ u.a.r. from all permutations of $\mathrm{vbl}(F)$
  return PPSZ($F$, $\beta$, $\pi$)
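The main loop of Algorithm 1 can be sketched in a few lines. The sketch below is an illustration under assumptions: it takes the already resolved formula $G$ as input (the call to Resolve is omitted), clauses are frozensets of signed integers, and the names `restrict` and `ppsz` are hypothetical.

```python
def restrict(clauses, var, value):
    """Return the formula with var fixed to value: drop satisfied clauses, shorten the rest."""
    true_lit = var if value == 1 else -var
    out = set()
    for c in clauses:
        if true_lit in c:
            continue                          # clause is satisfied, remove it
        out.add(frozenset(c - {-true_lit}))   # remove the falsified literal, if present
    return out

def ppsz(G, beta, order):
    """Process the variables in the given order on an already resolved formula G.

    beta maps each variable to 0/1 and is consulted only when a variable is guessed;
    order lists the variables in the order induced by the permutation.
    """
    alpha = {}
    for x in order:
        if frozenset({x}) in G:
            alpha[x] = 1                      # forced by the unit clause {x}
        elif frozenset({-x}) in G:
            alpha[x] = 0                      # forced by the unit clause {not x}
        else:
            alpha[x] = beta[x]                # guessed
        G = restrict(G, x, alpha[x])
    return alpha

# Example: x1, (not x1 or x2), (not x2 or x3) has the unique solution x1 = x2 = x3 = 1.
G = {frozenset({1}), frozenset({-1, 2}), frozenset({-2, 3})}
beta = {1: 0, 2: 0, 3: 0}
good = ppsz(G, beta, [1, 2, 3])   # every variable is forced
bad = ppsz(G, beta, [3, 2, 1])    # x3 is guessed wrongly from beta
```

The example shows why the order matters: with the order $(x_1, x_2, x_3)$ all three variables are forced, while with the reverse order $x_3$ must be guessed and the wrong guess propagates.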



Subcubes. For $D \subseteq V$ and $\alpha \in \{0,1\}^V$, the set
$B(D,\alpha) := \{\beta \in \{0,1\}^V \mid \alpha(x) = \beta(x)\ \forall x \in D\}$ is called a subcube.
The variables in $D$ are called defining variables and
those in $V \setminus D$ nondefining variables. The subcube $B(D,\alpha)$ has dimension $|V \setminus D|$. For
example, if $V = \{x_1, x_2, x_3\}$, $D = \{x_1, x_3\}$ and $\alpha = (1,0,0)$, then $B(D,\alpha)$ contains
exactly the two assignments $(1,0,0)$ and $(1,1,0)$. Given a nonempty set $S \subseteq \{0,1\}^V$,
there is a partition

\[ \{0,1\}^V = \bigcup_{\alpha \in S} B_\alpha \]

where the $B_\alpha$ are pairwise disjoint subcubes, and $\alpha \in B_\alpha$ for all $\alpha \in S$. See [7] for a
proof. For the rest of the paper, we fix such a partition for $S$ being the set of satisfying
assignments. To estimate the success probability of Comb, consider the assignment $\beta$
that Comb chooses uniformly at random from $\{0, 1\}^V$.

\[
\begin{aligned}
\Pr[\mathrm{Comb}(F) \in \mathrm{sat}(F)] &= \sum_{\alpha \in \mathrm{sat}(F)} \Pr[\mathrm{Comb}(F) \in \mathrm{sat}(F) \mid \beta \in B_\alpha] \cdot \Pr[\beta \in B_\alpha] \\
&\ge \min_{\alpha \in \mathrm{sat}(F)} \Pr[\mathrm{Comb}(F) \in \mathrm{sat}(F) \mid \beta \in B_\alpha].
\end{aligned}
\]

Hence instead of analyzing Comb for an assignment $\beta$ sampled uniformly at random from
all assignments, we fix $\alpha \in \mathrm{sat}(F)$ arbitrarily and we think of $\beta$ as being sampled from
the subcube $B_\alpha$. Let $N_\alpha$ be the set of non-defining variables of this cube, and $D_\alpha$ the set
of defining variables. Intuitively, if $B_\alpha$ has small dimension, then $\beta$ is likely to be close to
$\alpha$, thus Schoening has a better success probability:

Lemma 6 ([5]). $\Pr[\mathrm{Schoening}(F, \beta) \in \mathrm{sat}(F) \mid \beta \in B_\alpha] \ge (2 - 2/k)^{-|N_\alpha|}$.
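The subcube example from the paragraph above can be reproduced by direct enumeration. This is a small sketch with an assumed representation (assignments as dictionaries, hypothetical function name `subcube`).

```python
from itertools import product

def subcube(variables, D, alpha):
    """Return B(D, alpha) as a list of assignments (dicts from variable to 0/1)."""
    free = [v for v in variables if v not in D]   # the nondefining variables
    cube = []
    for bits in product((0, 1), repeat=len(free)):
        beta = dict(alpha)             # agree with alpha on the defining variables
        beta.update(zip(free, bits))   # nondefining variables take every value
        cube.append(beta)
    return cube

V = ["x1", "x2", "x3"]
alpha = {"x1": 1, "x2": 0, "x3": 0}
cube = subcube(V, {"x1", "x3"}, alpha)
points = sorted(tuple(b[v] for v in V) for b in cube)
```

The cube has $2^{|V \setminus D|}$ elements, here two, which matches its dimension $|V \setminus D| = 1$.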

Placements. As a next step, we analyze PPSZ($F$, $\beta$, $\pi$) with $\beta$ chosen uniformly at
random from $B_\alpha$ and the permutation also chosen from some subset of permutations. A
placement of the variables is a function $\sigma : V \to [0,1]$, and a uniform random placement
is defined by choosing $\sigma(x)$ uniformly at random from $[0, 1]$ independently for each $x \in V$.
With probability 1, a uniform random placement is injective and gives rise to a uniformly
distributed permutation via the natural ordering $<$ on $[0, 1]$. For the rest of the paper,
we will view $\pi$ as a placement rather than a permutation. Let $\Gamma$ be a measurable set of
placements. Then

\[
\Pr[\mathrm{PPSZ}(F,\beta,\pi) \in \mathrm{sat}(F) \mid \beta \in B_\alpha] \ge
\Pr[\mathrm{PPSZ}(F,\beta,\pi) \in \mathrm{sat}(F) \mid \beta \in B_\alpha, \pi \in \Gamma] \cdot \Pr[\pi \in \Gamma].
\]

The benefit of this is that we can tailor $\Gamma$ towards our needs, i.e., making the conditional
probability $\Pr[\mathrm{PPSZ}(F, \beta, \pi) \in \mathrm{sat}(F) \mid \beta \in B_\alpha, \pi \in \Gamma]$ fairly large. This may come at
the cost of making $\Pr[\pi \in \Gamma]$ small.

Algorithm 3 Schoening(CNF formula $F$, assignment $\beta$)
  for $3|\mathrm{vbl}(F)|$ steps do
    if $\beta$ satisfies $F$ then
      return $\beta$
    end if
    Select an arbitrary $C \in F$ not satisfied by $\beta$
    Select a variable $x$ u.a.r. from $\mathrm{vbl}(C)$ and flip $x$ in $\beta$
  end for
  return $\beta$

Algorithm 4 Comb(CNF formula $F$)
  {this algorithm is used for 3-SAT}
  Choose $\beta$ u.a.r. from all assignments on $\mathrm{vbl}(F)$
  $\alpha \leftarrow$ PPSZ($F$, $\beta$)
  if $\alpha \notin \mathrm{sat}(F)$ then
    $\alpha \leftarrow$ Schoening($F$, $\beta$)
  end if
  return $\alpha$
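The random walk of Algorithm 3 can be sketched as follows. The encoding (clauses as tuples of signed integers, assignments as dictionaries) and the explicit random generator argument `rng` are assumptions for illustration; "an arbitrary" unsatisfied clause is taken to be the first one found.

```python
import random

def satisfied(clause, beta):
    """True if the assignment beta satisfies at least one literal of the clause."""
    return any(beta[abs(l)] == (1 if l > 0 else 0) for l in clause)

def schoening(clauses, beta, rng):
    """One run of the random walk, started at beta, for 3 * (number of variables) steps."""
    beta = dict(beta)                          # do not modify the caller's assignment
    for _ in range(3 * len(beta)):
        unsat = [c for c in clauses if not satisfied(c, beta)]
        if not unsat:
            return beta                        # beta satisfies the formula
        clause = unsat[0]                      # an arbitrary unsatisfied clause
        x = abs(rng.choice(sorted(clause)))    # variable chosen u.a.r. from the clause
        beta[x] = 1 - beta[x]                  # flip x
    return beta

# Example: two unit clauses; starting from the all-zero assignment the walk
# has no choice and reaches the satisfying assignment after two flips.
F = [(1,), (2,)]
result = schoening(F, {1: 0, 2: 0}, random.Random(0))
```

As in Algorithm 3, the returned assignment is not guaranteed to satisfy the formula; a caller such as Comb has to check the result and repeat with fresh randomness.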

Forced variables. Suppose the permutation $\pi$ orders the variables $V$ as $(x_1, \dots, x_n)$.
Let $\alpha$ be a satisfying assignment of $F$. Imagine we call PPSZ($F$, $\alpha$, $\pi$). The algorithm
applies bounded resolution to $F$, obtaining $G = \mathrm{Resolve}(F, \log(n))$ and sets the variables
$x_1, \dots, x_n$ step by step to their respective values under $\alpha$, creating a sequence of formulas
$G = G_0, G_1, \dots, G_n$, where $G_i = G_{i-1}^{[x_i \mapsto \alpha(x_i)]}$ for $1 \le i \le n$. Since $\alpha$ is a satisfying
assignment, $G_n$ is the empty formula. We say $x_i$ is forced with respect to $\alpha$ and $\pi$ if $G_{i-1}$
contains the unit clause $\{x_i\}$ or $\{\bar{x}_i\}$. By $\mathrm{forced}(\alpha, \pi)$ we denote the set of variables $x$
that are forced with respect to $\alpha$ and $\pi$. If $x$ is not forced, we say it is guessed. We denote
by $\mathrm{guessed}(\alpha, \pi)$ the set of guessed variables. Note that PPSZ($F$, $\beta$, $\pi$) returns $\alpha$ if and
only if $\alpha(x) = \beta(x)$ for all $x \in \mathrm{guessed}(\alpha, \pi)$. Furthermore, since $\beta$ is chosen uniformly at
random from $B_\alpha$, it agrees with $\alpha$ on the defining variables and is uniform on the
nondefining variables, so

\[ \Pr[\mathrm{PPSZ}(F,\beta,\pi) = \alpha \mid \beta \in B_\alpha] = \mathbf{E}_\pi\left[2^{-|N_\alpha \cap \mathrm{guessed}(\alpha,\pi)|}\right] \tag{1} \]

\[ \ge 2^{-\mathbf{E}[|N_\alpha \cap \mathrm{guessed}(\alpha,\pi)|]}, \tag{2} \]

where the inequality comes from Jensen's inequality applied to the convex function
$t \mapsto 2^{-t}$. Note that (2) holds when taking $\pi$ uniformly at random as well as when sampling it
from some set $\Gamma$. Using linearity of expectation, we see that

\[ \mathbf{E}[|N_\alpha \cap \mathrm{guessed}(\alpha,\pi)|] = \sum_{x \in N_\alpha} \Pr[x \in \mathrm{guessed}(\alpha,\pi)]. \tag{3} \]

Now if $\alpha$ is the unique satisfying assignment, then $N_\alpha = V$. For 3-SAT, one central result
of [7] is that

Lemma 7 ([7]). Let $F$ be a satisfiable 3-CNF formula with a unique satisfying assignment
$\alpha$. Then for every $x \in \mathrm{vbl}(F)$, it holds that
$\Pr[x \in \mathrm{guessed}(\alpha, \pi)] \le 2\ln(2) - 1 + o(1) < 0.3863$.

Combining the lemma with (2) shows that PPSZ on 3-CNF formulas with a unique
satisfying assignment has a success probability of at least $2^{-(2\ln(2) - 1 + o(1))n} = \Omega(1.308^{-n})$.
For the case of multiple satisfying assignments, the lemma does not hold anymore.

Critical variables. Let $F$ be a satisfiable CNF formula and $x$ a variable. Recall that
we call $x$ critical if all satisfying assignments of $F$ agree on $x$. The following observation
is not difficult to show:

Observation 8. Let $F$ be a satisfiable CNF formula and let $V_c$ be the set of critical
variables. Let $B_\alpha$ be the subcube as defined above. For a satisfying assignment $\alpha$, let $N_\alpha$
be the set of nondefining variables. Then $V_c \subseteq N_\alpha$.

Lemma 9. Let $F$ be a satisfiable 3-CNF formula and $\alpha$ be a satisfying assignment.
There is a measurable set $\Gamma \subseteq [0, 1]^V$ of placements such that for $\beta = 0.8022563838$ and
$\gamma = 0.6073995502$, we have

1. $\Pr[\pi \in \Gamma] \ge 2^{-\beta|D_\alpha| - o(n)} \approx 0.57345159^{|D_\alpha|} \cdot 2^{-o(n)}$,

2. $\Pr[x \in \mathrm{forced}(\alpha,\pi) \mid \pi \in \Gamma] \ge \gamma - o(1) \approx 0.6073995502 - o(1)$ for all $x \in N_\alpha$,

3. $\Pr[x \in \mathrm{forced}(\alpha, \pi) \mid \pi \in \Gamma] \ge 2 - 2\ln(2) - o(1) \approx 0.6137056$ for all critical $x \in V$.

The important part of the lemma is point 3, namely that critical variables are forced
with a larger probability than non-critical ones.

Proof of Theorem 2. Using Theorem 1, we can assume $c(F) \ge 0.48659459$. Let
$\Delta := |D_\alpha|/|V| = 1 - |N_\alpha|/|V|$ be the fraction of defining variables. Combining (3) with
Lemma 9, we obtain

\[
\begin{aligned}
\mathbf{E}[|N_\alpha \cap \mathrm{guessed}(\alpha,\pi)| \mid \pi \in \Gamma] &= \sum_{x \in N_\alpha} \Pr[x \in \mathrm{guessed}(\alpha, \pi) \mid \pi \in \Gamma] \\
&\le (2\ln 2 - 1)|V_c| + (1-\gamma)|N_\alpha \setminus V_c| + o(n) \\
&\le (2\ln 2 - 1)c^* n + (1 - \gamma)(1 - \Delta - c^*)n + o(n) \\
&= 0.389532\,n - 0.3926004498\,\Delta n + o(n).
\end{aligned}
\]






The expected fraction of nondefining variables we have to guess is thus a little bit larger
than in the case of a unique satisfying assignment, where it is $\approx 0.3863$. Together with
(2), we conclude that the success probability of PPSZ is at least

\[
\begin{aligned}
\Pr[\mathrm{PPSZ}(F, \beta, \pi) = \alpha \mid \beta \in B_\alpha] &\ge \Pr[\mathrm{PPSZ}(F, \beta, \pi) = \alpha \mid \beta \in B_\alpha, \pi \in \Gamma] \cdot \Pr[\pi \in \Gamma] \\
&\ge 2^{-\mathbf{E}[|N_\alpha \cap \mathrm{guessed}(\alpha,\pi)| \mid \pi \in \Gamma]} \cdot \Pr[\pi \in \Gamma] \\
&\ge 2^{-0.389532\,n + 0.3926004498\,\Delta n} \cdot 0.57345159^{\Delta n} \cdot 2^{-o(n)} \\
&\ge 1.3099684^{-n} \cdot 1.328369^{-\Delta n} \cdot 2^{-o(n)}.
\end{aligned}
\tag{4}
\]

Our bound on the success probability of PPSZ thus deteriorates with the number of
defining variables. A bigger subcube is better for PPSZ. We combine this with the
bound for Schöning's algorithm from Iwama and Tamaki [5], stated above in Lemma 6:

\[ \Pr[\mathrm{Schoening}(F, \beta) \in \mathrm{sat}(F) \mid \beta \in B_\alpha] \ge (2 - 2/k)^{-(1-\Delta)n}. \tag{5} \]

The combined worst case is with $\Delta \approx 0.0309273$, in which case both (4) and (5)
evaluate to $\Omega(1.32153^{-n})$. Therefore for any $\Delta$, at least one of Schoening and PPSZ
has a success probability of $\Omega(1.32153^{-n})$. □
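The constants in the proof of Theorem 2 can be re-derived with a few lines of arithmetic. The sketch below only uses the values of $c^*$, $\beta$ and $\gamma$ stated in the text and the Schöning bound for $k = 3$; the variable names are chosen for illustration. It equates the exponents of (4) and (5) to find the worst-case $\Delta$.

```python
import math

c_star = 0.48659459      # threshold on the fraction of critical variables
beta_H = 0.8022563838    # constant from Lemma 9, point 1
gamma_H = 0.6073995502   # constant from Lemma 9, point 2

guess_critical = 2 * math.log(2) - 1      # guessing probability of a critical variable
# exponent of (4) at Delta = 0, in bits per variable
e0 = guess_critical * c_star + (1 - gamma_H) * (1 - c_star)
# additional exponent of (4) per unit of Delta
slope = beta_H - (1 - gamma_H)
# exponent of (5) at Delta = 0: log2(2 - 2/k) for k = 3
schoening_rate = math.log2(4 / 3)

# PPSZ:      -log2(prob)/n = e0 + slope * Delta
# Schoening: -log2(prob)/n = (1 - Delta) * schoening_rate
delta = (schoening_rate - e0) / (slope + schoening_rate)
base = 2 ** (e0 + slope * delta)

# Theorem 1 with 1 - c*/2 = 1/1.32153 gives the matching threshold c*.
c_from_theorem1 = 2 * (1 - 1 / 1.32153)
```

Under these inputs the two bounds cross at $\Delta \approx 0.0309$ with base $\approx 1.3215$, in agreement with the values quoted in the proof; for smaller $\Delta$ the PPSZ bound is the better one, for larger $\Delta$ the Schoening bound.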

Proof of Theorem 4. Lemma 6 from [4] tells us that there is an algorithm ISTTSch that
improves Schoening such that for all $m^* \in [0, 1/3]$ we have, after preprocessing time $6^{m^* n}$,

\[ \Pr[\mathrm{ISTTSch}(F,\beta) \in \mathrm{sat}(F) \mid \beta \in B_\alpha] \ge 1.012795^{m^* n} \cdot 1.2845745^{\Delta n} \cdot (3/4)^n. \]

We want to prove that by replacing Schoening with ISTTSch in Comb, we obtain
expected running time of $O(1.321^n)$. Setting $c^* := 0.48599$ and $m^* := 0.155371873$ gives
$1 - c^*/2 > 1/1.321$ and $6^{m^*} \le 1.321$. With this choice of $c^*$, we have the following bound
for PPSZ (obtained as in the previous proof, but with a different constant $c^*$):

\[ \Pr[\mathrm{PPSZ}(F,\beta,\pi) = \alpha \mid \beta \in B_\alpha] \ge 1.31^{-n} \cdot 1.3312^{-\Delta n} \cdot 2^{-o(n)}. \]

The combined worst case is at $\Delta \approx 0.029225$ where $1.31^{-n} \cdot 1.3312^{-\Delta n} \ge 1.321^{-n}$ and
$1.012795^{m^* n} \cdot 1.2845745^{\Delta n} \cdot (3/4)^n \ge 1.321^{-n}$, proving that the combined success
probability is $\Omega(1.321^{-n})$ (after preprocessing time $O(1.321^n)$). □
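The numbers in the proof of Theorem 4 can be checked in the same way. This is a plain arithmetic sketch using only the constants quoted in the proof; each quantity is expressed as the base $x$ of a bound of the form $x^{-n}$ or $x^{n}$, and the variable names are illustrative.

```python
c_star, m_star, delta = 0.48599, 0.155371873, 0.029225

# PPSZ bound 1.31^(-n) * 1.3312^(-delta n), written as ppsz_base^(-n)
ppsz_base = 1.31 * 1.3312 ** delta

# ISTTSch bound 1.012795^(m* n) * 1.2845745^(delta n) * (3/4)^n, written as istt_base^(-n)
istt_base = 1 / (1.012795 ** m_star * 1.2845745 ** delta * 0.75)

# preprocessing time 6^(m* n), written as preprocessing_base^n
preprocessing_base = 6 ** m_star

# quantity r = 1 - c*/2 from Theorem 1
r = 1 - c_star / 2
```

All three bases come out at about 1.321, and $r$ exceeds $1/1.321$, which is what the proof needs.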



3 Proof of Lemma 9 
3.1 Critical Clause Trees 

Let $G := \mathrm{Resolve}(F, \log(n))$. Note that $\mathrm{vbl}(F) = \mathrm{vbl}(G)$ and $\mathrm{sat}(F) = \mathrm{sat}(G)$. A
critical clause for $x \in V$ w.r.t. $\alpha$ is a clause where $\alpha$ satisfies exactly one literal and this
literal is over $x$. It can be easily seen that if the output of PPSZ should be $\alpha$, then
exactly the critical clauses of $G$ are the clauses that might turn into unit clauses. Note
that the defining variables are assumed to be set correctly, so we only need to consider
critical clauses for nondefining variables here.

We now define critical clause trees, a concept that tells us which critical clauses we
can expect in a CNF formula after bounded resolution. Let $T$ be a rooted tree in which
every node is either labeled with a variable from $V$ or is unlabeled. A cut in a rooted
tree is a set of nodes $A$ such that the root is not in $A$ and every path from the root to a
leaf contains at least one node in $A$. The depth of a node is the distance to the root. For
a set $A$ of nodes, $\mathrm{vbl}(A)$ denotes the set of variables occurring as labels in $A$. We say $T$
is a critical clause tree for $x$ w.r.t. $G$ and $\alpha$ if the following properties hold:

1. The root is labeled by $x$.

2. On any path from the root to a leaf, no two nodes have the same label.

3. For any cut $A$ of the tree, there is a critical clause $C \in G$ w.r.t. $\alpha$ where the satisfied
literal is over $x$ and every unsatisfied literal is over some variable in $\mathrm{vbl}(A)$.

It is shown in [7] that we can construct a critical clause tree
for $x \in N_\alpha$ as follows: Start with the root labeled $x$. Now we
can repeatedly extend a leaf node $v$. Let $L$ be the set of labels
that occur on the path from $v$ to the root. If $\alpha \oplus L$ does not
satisfy $F$, then we can extend the tree at that node: There is a
clause $C$ in $F$ (not in $G$) not satisfied by $\alpha \oplus L$. For each literal
in $C$ that is not satisfied by $\alpha$, we add a child to $v$ labeled with
the variable of that literal. If there are no such literals, we add
an unlabeled node. As clauses of $F$ have at most $k$ literals, each
node has at most $k - 1$ children. If the constructed tree has at
most $\log(n)$ nodes (as we do $\log(n)$-bounded resolution), then
it is a critical clause tree for $x$ w.r.t. $G$ and $\alpha$.

We give a simple example: Let

\[ F := \{\{x, \bar{y}, \bar{z}\}, \{x, y, \bar{a}\}, \{z, \bar{b}, c\}, \{x, z, \bar{c}\}\}. \]

For the all-one assignment and $x$, we can get the tree shown in Figure 1 by the
described procedure. $\{a,b\}$ is a cut in this tree. We have
$R(\{z,\bar{b},c\},\{x, z,\bar{c}\}) = \{x,z,\bar{b}\}$,
$R(\{x,\bar{y},\bar{z}\},\{x,y,\bar{a}\}) = \{x,\bar{z},\bar{a}\}$ and
$R(\{x, z,\bar{b}\},\{x,\bar{z},\bar{a}\}) = \{x,\bar{a},\bar{b}\}$, giving the required
critical clause.

If $\alpha$ is the only satisfying assignment of $F$, $\alpha \oplus L$ never satisfies $F$, and we can build
a tree where all leaves are at depth $d := \lfloor \log_k(\log(n)) \rfloor$. We call this a full tree. The
important observation is now that this also works if $x$ is a critical variable, as in that case
$\alpha \oplus L$ also never satisfies $F$, as $x \in L$.

In the general case, however, the assignment $\alpha \oplus L$ might satisfy $F$ so that we cannot
extend the tree. However, if $L$ consists only of nondefining variables, then we know that
$\alpha \oplus L$ does not satisfy $F$. Hence we can get a tree where every leaf not at depth $d$ is
labeled by a defining variable. We define the trees we will use in the analysis:

Definition 10. For $x \in N_\alpha$, construct the critical clause tree $T_x$ for $x$ as follows: If $x$ is
a critical variable, then construct $T_x$ such that all leaves are at depth $d$, i.e., construct a
full tree. Otherwise, construct $T_x$ such that all leaves not labeled by defining variables are
at depth $d$.

This means that a tree might just consist of a root where all children are labeled with
defining variables, which essentially nullifies the benefits from resolution. To cope with
this, we have to make defining variables more likely to occur at the beginning. We achieve
this by choosing the set $\Gamma$ of placements whose existence we claim in Lemma 9 in a way
such that exactly that happens.

Definition 11. A function $H : [0,1] \to [0,1]$ is called a nice distribution function if $H$
is non-decreasing, uniformly continuous, $H(0) = 0$, $H(1) = 1$, $H$ is differentiable except
for finitely many points and $H(r) \ge r$.

Figure 1: Example Critical Clause Tree

Compared with [7], we added the requirement $H(r) \ge r$. This will mean that defining
variables cannot be less likely to occur at the beginning than nondefining variables. We
now define a random placement where defining variables are placed with distribution
function $H$:

Definition 12. Let $H$ be a nice distribution function. By $\pi_H$, we define the random
placement on $V$ s.t. $\pi(x)$ for $x \in N_\alpha$ is u.a.r. $\in [0, 1]$, and for $x \in D_\alpha$ and $r \in [0, 1]$,
$\Pr(\pi(x) \le r) = H(r)$.

Assume that the variables are processed according to some placement $\pi$. Consider $T_x$.
If there is a cut $A$ such that $\pi(y) < \pi(x)$ for every $y \in \mathrm{vbl}(A)$, then $x$ is forced, as the
corresponding critical clause has turned into a unit clause for $x$. Let $S_x(\pi)$ be the set of
nodes of $T_x$ whose labels $y$ satisfy $\pi(y) < \pi(x)$, and denote the probability
that $S_x(\pi)$ is a cut in $T_x$ by $Q(T_x, \pi)$.

For $r \in [0, 1]$, let $R_k(r)$ be the smallest non-negative $x$ that satisfies
$x = (r + (1 - r)x)^{k-1}$ and $R_k := \int_0^1 R_k(r)\,dr$. It was shown in [7] that if $T_x$ is a full tree, then

\[ Q(T_x, \pi_U) \ge R_k - o(1), \]

where $\pi_U$ denotes the uniform random placement.

$R_k(r)$ can be understood as follows: Take an infinite $(k - 1)$-ary tree and mark each node
as "dead" with probability $r$, except the root. $R_k(r)$ is the probability that this tree
contains no infinite path that starts at the root and contains only "alive" nodes.
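The quantity $R_k(r)$ and its integral $R_k$ are easy to approximate numerically. The sketch below is an illustration with assumed names and tolerances: it iterates the map $x \mapsto (r + (1-r)x)^{k-1}$ from $x = 0$, which increases monotonically towards the smallest non-negative fixed point, and integrates with the midpoint rule.

```python
import math

def R(k, r, tol=1e-12, max_iter=20000):
    """Approximate the smallest non-negative solution of x = (r + (1 - r) x)^(k - 1)."""
    x = 0.0
    for _ in range(max_iter):
        nxt = (r + (1 - r) * x) ** (k - 1)
        if nxt - x < tol:          # the iterates are non-decreasing, so this detects convergence
            return nxt
        x = nxt
    return x                       # slow convergence near the point where R_k(r) reaches 1

def R_integral(k, steps=1000):
    """Midpoint-rule approximation of the integral of R_k(r) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(R(k, (i + 0.5) * h) for i in range(steps))

R3 = R_integral(3)
R4 = R_integral(4)
```

For $k = 3$ the fixed point has the closed form $(r/(1-r))^2$ on $[0, 1/2]$, which gives a quick sanity check of the iteration, and the integral can be compared with $2 - 2\ln 2$.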

We have $R_3 = 2 - 2\ln 2 \approx 0.6137$ and $R_4 \approx 0.4451$. For $r \in [0, 1/2]$, we have
$R_3(r) = \left(\frac{r}{1-r}\right)^2$ and for $r \in [1/2, 1]$, we have $R_3(r) = 1$. As $H(r) \ge r$, and by
definition of $\pi_H$ and of a cut, it is obvious that

\[ Q(T_x, \pi_H) \ge R_k - o(1), \tag{6} \]

if $T_x$ is a full tree. If $T_x$ is not a full tree, we do not have any good bounds on $Q(T_x, \pi_U)$.
In [9] it is shown that if $T_x$ is not necessarily a full tree, but a tree in which every leaf not
at depth $d$ is labeled by a defining variable, then

\[ Q(T_x, \pi_H) \ge \gamma_H - o(1), \tag{7} \]

where

\[ \gamma_H := \int_0^1 \min\{H(r)^{k-1}, R_k(r)\}\,dr. \]

Obviously $\gamma_H \le R_k$, which means that the bound (6) for full trees is at least as strong as
the bound (7) for general trees. The $H(r)^{k-1}$ term corresponds to the tree that consists of
a root where all children are labeled with defining variables and are thus leaves (remember
that there are at most $k - 1$ children). It takes a small lemma to show that this tree and
the full tree are the worst cases. See [2] for details. The following observation summarizes
this:

Observation 13. If $x$ is a critical variable, then $Q(T_x, \pi_H) \ge R_k - o(1)$. If $x$ is a
noncritical nondefining variable, then $Q(T_x, \pi_H) \ge \gamma_H - o(1)$.

We want to find a set $\Gamma$ of placements such that a placement chosen uniformly at
random from $\Gamma$ behaves more or less like $\pi_H$.

Lemma 14 (old version of [7]). Let $H$ be a nice distribution function. If $|D_\alpha| \ge \sqrt{n}$,
there is a set of placements $\Gamma$ depending on $n$ with the following properties: Let $\pi_\Gamma$ be the
placement chosen uniformly at random from $\Gamma$. Then for any tree $T$ with at most $\log(n)$
nodes we have

\[ Q(T, \pi_\Gamma) \ge Q(T, \pi_H) - o(1) \]

and

\[ \Pr(\pi_U \in \Gamma) \ge 2^{-\beta_H |D_\alpha| - o(n)} \]

with

\[ \beta_H := \int_0^1 h(r) \log(h(r))\,dr \]

where $h(r)$ is the derivative of $H(r)$.

The proof of this lemma is long and complicated, see Sections 4.2 and 4.3 in [2]. The
case $|D_\alpha| < \sqrt{n}$ is easy to handle: The probability that all defining variables come at the
beginning is substantial, and we are essentially in the (good) unique case.

Below we will show how to choose a good function $H$ for the case $k = 3$ and $k = 4$.
To get an intuition, see Figure 2 for a plot of $H$ for $k = 3$. With this function, one
obtains $\gamma_H \approx 0.6073995502$ and $\beta_H \approx 0.8022563838$. Together with Lemma 14 and
Observation 13, we conclude that for a critical variable $x$

\[ \Pr[x \in \mathrm{forced}(\alpha, \pi)] \ge Q(T_x, \pi_H) - o(1) \ge R_k - o(1) \approx 0.61371, \]

and for a non-critical non-defining variable $x$

\[ \Pr[x \in \mathrm{forced}(\alpha,\pi)] \ge Q(T_x, \pi_H) - o(1) \ge \gamma_H - o(1) \ge 0.6073995502 - o(1). \]

3.2 Choosing a good H 

Let now $k = 3$. We choose $H$ as in [9]: Let $\theta \in [0.5,1]$ be a parameter. With some
appropriate parameters $a$ and $b > 1$, we define $H(r)$ as follows:

\[
H(r) := \begin{cases}
r/\theta & \text{if } r \in [0, 1-\theta) \\
1 - (-a \ln(r))^b & \text{if } r \in [1-\theta, 1]
\end{cases}
\]

3-SAT. To determine $a$ and $b$, we set the constraints

\[ H(1-\theta) = R_3(1-\theta)^{1/2} \]

(as $\theta \ge 1/2$, this right-hand side is equal to $(1-\theta)/\theta$) and

\[ h(1-\theta) = 1/\theta. \]

If these constraints are satisfied, $H(r)$ is a nice
distribution function that is differentiable on $[0, 1]$.
Figure 2 gives a plot of the $H(r)$ we use. Numerical
optimization gives $\theta \approx 0.52455825$ and as before
$c^* \approx 0.48659459$. See Section 4.6 in [2] for details
of the computation. This gives

\[
\begin{aligned}
a &\approx 0.96782885577, \\
b &\approx 7.19709520894, \\
\beta_H &\le 0.8022563838, \\
\gamma_H &\ge 0.6073995502.
\end{aligned}
\]

This concludes the proof of Lemma 9.
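The parameters of $H$ for 3-SAT can be reproduced numerically. The sketch below is not the computation of [2]; it is an illustration under assumptions. The closed forms for $a$ and $b$ are obtained here by solving the two constraints at $r = 1 - \theta$ by hand, the integrals are approximated with the midpoint rule, and all names are chosen for illustration.

```python
import math

theta = 0.52455825
t = 1 - theta                               # breakpoint between the two pieces of H

# Solving  1 - (-a ln t)^b = t / theta  and  h(t) = 1 / theta  for a and b
# (a hand derivation for this sketch, not taken from the text):
b = t * (-math.log(t)) / (2 * theta - 1)
a = ((2 * theta - 1) / theta) ** (1 / b) / (-math.log(t))

def H(r):
    return r / theta if r < t else 1 - (-a * math.log(r)) ** b

def h(r):
    """Derivative of H."""
    if r < t:
        return 1 / theta
    return a * b * (-a * math.log(r)) ** (b - 1) / r

def R3(r):
    return (r / (1 - r)) ** 2 if r < 0.5 else 1.0

def integrate(f, lo, hi, steps=20000):
    """Midpoint rule; it never evaluates f at the endpoints."""
    w = (hi - lo) / steps
    return w * sum(f(lo + (i + 0.5) * w) for i in range(steps))

def h_log_h(r):
    v = h(r)
    return v * math.log2(v) if v > 0 else 0.0   # convention 0 log 0 := 0

def gamma_integrand(r):
    return min(H(r) ** 2, R3(r))                # k - 1 = 2 for 3-SAT

# integrate the two pieces of H separately to respect the breakpoint
gamma_H = integrate(gamma_integrand, 0, t) + integrate(gamma_integrand, t, 1)
beta_H = integrate(h_log_h, 0, t) + integrate(h_log_h, t, 1)
```

With $\theta$ as above, the computed values of $a$, $b$, $\gamma_H$ and $\beta_H$ should agree with the constants listed in the text up to the accuracy of the numerical integration.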




Figure 2: $H(r)$ for 3-SAT

4-SAT. For 4-SAT, we use the $H$ corresponding to the new version of [7]. For some
parameter $\theta \le 1$, we let $H(r) := \min\{r/\theta, 1\}$. It turns out that the optimum is when
$\beta_H = 1 - \gamma_H$. In that case it is easily seen that the bound for PPSZ does not depend on
$|D_\alpha|$, and hence we do not need Schoening. Numerical optimization gives $\theta \approx 0.6803639$
and $c^* \approx 0.63878808$. This implies the success probability $\Omega(1.46928^{-n})$, proving Theorem
5.

4 Conclusion 

We have shown how to improve PPSZ by a preprocessing step that guarantees that a 
substantial fraction of variables will be critical. With this, we were able to improve the 
bound for 3-SAT and 4-SAT from [9]. We have also shown that our approach nicely 
combines with the improvement by [4] by giving an even better bound. In 4-SAT, we 
are already very close to the unique case. We do not know if a more refined choice of $H$
(similar to [9]), possibly depending on $\Delta$, allows us to close that gap.

It is interesting to see that we could make use of multiple assignments in the guessing 
step before considering just one assignment using the subcube partition. 
A Proof of the $O(1.32065^n)$ bound

In this section we prove that there exists an algorithm that for every satisfiable 3-CNF
formula finds a satisfying assignment in expected running time $O(1.32065^n)$, as stated in
Theorem 3.

First we show how to derive from [4] a statement similar to Lemma 6 of [4]. They have
used such a lemma, but did not state it explicitly. Then analogously to before, we give
the parameters $\theta$, $c^*$ and $m^*$ (derived by numerical optimization) to prove the claimed
bound.

Lemma 15 ([4]). Let f_m := 64/63 and f_d := 1.28248358. Let A := |D_a|/|F|, as before. For
m* ∈ [0, 1/3] we have after preprocessing time O(6^{m* n}) that

Pr[ISTTSCH(F, β) ∈ sat(F) | β ∈ B_a] ≥ (f_m)^{m* n} · (f_d)^{A n} · (3/4)^n.

Note that f_m = 64/63 ≈ 1.015873 > 1.012795, which is the corresponding number in Lemma
6 of [4]; however, f_d decreases from 1.2845745 to 1.28248358. This means that our bound is
better if A is small, but worse if A is large. However, as the combined worst case is for small A
(≈ 0.0286138), we improve the probability of the combined algorithm nonetheless.
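The trade-off just described can be checked numerically. The sketch below is only an illustration under stated assumptions: it assumes the bound has the form (f_m)^{m* n} · (f_d)^{A n} · (3/4)^n, that f_m is the rational 64/63 (which matches the quoted decimal 1.015873), and that the same m* ≈ 0.155223982 applies to both the old and the new constants. All variable and function names are hypothetical.

```python
from fractions import Fraction
import math

# Constants quoted in the text (new values) and from Lemma 6 of [4] (old values).
f_m_new = Fraction(64, 63)   # assumption: the rational behind 1.015873
f_m_old = 1.012795
f_d_new = 1.28248358
f_d_old = 1.2845745
m_star = 0.155223982         # assumption: same m* used for both bounds


def log_gain_per_variable(A: float) -> float:
    """ln(new bound / old bound) divided by n, under the assumed bound form."""
    return (m_star * math.log(float(f_m_new) / f_m_old)
            + A * math.log(f_d_new / f_d_old))


# Value of A at which the gain from f_m is cancelled by the loss from f_d.
A_break_even = (m_star * math.log(float(f_m_new) / f_m_old)
                / math.log(f_d_old / f_d_new))

print(float(f_m_new))
print(A_break_even)
print(log_gain_per_variable(0.0286138) > 0)
```

Under these assumptions the new constants give a better bound for every A below the break-even value, which lies well above the combined worst case A ≈ 0.0286138.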

Proof. We can interpret ISTTSCH as follows: We first do a preprocessing step using an
algorithm from Baumer and Schuler [1] that takes time O(6^{m* n}). This either finds a
satisfying assignment of F with high probability or it finds a set of independent 3-clauses
C (clauses that do not share variables) of size at least m* · n. In the latter case, this set of
independent clauses is stored and ISTTSCH does the following: The initial assignment β
is modified on the variables of C to an assignment β'. Then SCHOENING(F, β') is called.
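Independence here means that no two chosen clauses share a variable. As an illustration of that notion only, and not of the Baumer and Schuler preprocessing itself, the following minimal Python sketch greedily collects a maximal set of variable-disjoint 3-clauses. The DIMACS-style integer encoding of literals and the function name are assumptions made for this example.

```python
from typing import List, Set, Tuple

# Assumed encoding: a literal is a non-zero int, v for x_v and -v for its negation.
Clause = Tuple[int, ...]


def greedy_independent_clauses(clauses: List[Clause]) -> List[Clause]:
    """Greedily pick 3-clauses so that no two picked clauses share a variable."""
    used_variables: Set[int] = set()
    chosen: List[Clause] = []
    for clause in clauses:
        variables = {abs(literal) for literal in clause}
        # Only genuine 3-clauses over three distinct, still unused variables qualify.
        if len(clause) == 3 and len(variables) == 3 and used_variables.isdisjoint(variables):
            chosen.append(clause)
            used_variables |= variables
    return chosen


example = [(1, -2, 3), (3, 4, 5), (-4, 5, 6), (7, 8)]
independent = greedy_independent_clauses(example)
print(independent)
```

A greedy pass yields a maximal set, not necessarily a maximum one; it is used here only because it is the simplest way to make the definition concrete.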
type j of C    e(j), the SCHOENING success probability on the variables of C

Table 1: Clause type and SCHOENING success probability



In [4] it was shown that, for the success probability of SCHOENING(F, β'), each of the
independent clauses can be analyzed separately. For a satisfying assignment a, we determine the
type of a clause C by the number of literals that correspond to non-defining variables,
to defining variables with satisfied literals, and to defining variables with unsatisfied
literals. There are 9 types, which are denoted by 0, 10, 11, 20, 21, 22, 31, 32, 33. The first
digit denotes the number of literals of C over defining variables, the second digit denotes
the number of those literals that are satisfied. The corresponding probability of
SCHOENING is listed in Table 1, as in Table 3 of [4].
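The list of types can be reproduced mechanically. The sketch below is a small illustration, with a hypothetical function name: it enumerates pairs (d, s), where d is the number of defining variables in a 3-clause and s the number of satisfied literals among them. The one excluded combination is d = 3, s = 0, since a clause whose three variables are all defining and all unsatisfied would be falsified by the satisfying assignment.

```python
from typing import List


def clause_types() -> List[str]:
    """Enumerate the type labels of a 3-clause under a satisfying assignment."""
    types = ["0"]  # no defining variable at all
    for defining in range(1, 4):
        for satisfied in range(0, defining + 1):
            if defining == 3 and satisfied == 0:
                # All three literals defining and unsatisfied: clause would be false.
                continue
            types.append(f"{defining}{satisfied}")
    return types


all_types = clause_types()
print(all_types)
```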

Iwama et al. have then shown that there are 16 patterns in which the subcube partition
(dependent on the independent 3-clauses C) of the assignments on the variables of a clause
can result in these types, as shown in Table 2. Note that patterns 9, 10 and patterns
13, 14 have the same type outcomes, but are noted as different patterns in [4]. Pattern 0
corresponds to type 0 and was not treated explicitly as a pattern in [4]. Furthermore,
it was shown that with high probability the number of resulting types is close to the
expectation. Let p(i,j) denote the probability that pattern i turns into type j. For type
j, let d(j) denote the number of defining variables (i.e. the first digit). Then we have to
show the following bound for every pattern i:



The left-hand side corresponds to the expected SCHOENING probability of a clause of
pattern i; the right-hand side corresponds to the term we want in the statement of the
lemma. See [4] for details. As f_m is a rational number that is easily derived from pattern 0
and hence type 0, we can check the following inequality for patterns 1 to 15:



We have listed the numerical results of f_d(j) in Table 3 (9 significant digits, rounded
down). The worst case for f_d(j) is pattern 4, which corresponds to f_d of the lemma
statement.
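The worst case can be read off Table 3 directly. As a minimal check, assuming that the constant used in the lemma must be valid for every pattern and is therefore the smallest listed value, the following sketch takes the minimum over the table entries. The dictionary name is hypothetical; the numbers are copied from Table 3.

```python
# Values of f_d(j) for patterns 1 to 15, as listed in Table 3.
f_d_by_pattern = {
    1: 1.28611973, 2: 1.28272221, 3: 1.28466750, 4: 1.28248358,
    5: 1.29339711, 6: 1.29507819, 7: 1.29507819, 8: 1.30294154,
    9: 1.29080377, 10: 1.29080377, 11: 1.29080377, 12: 1.29749876,
    13: 1.29749876, 14: 1.29749876, 15: 1.30300231,
}

# The smallest value is the one that holds for all patterns simultaneously.
worst_pattern = min(f_d_by_pattern, key=f_d_by_pattern.get)
f_d = f_d_by_pattern[worst_pattern]
print(worst_pattern, f_d)
```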



□

[Table 2: Probability distribution of types, listing for each pattern number 1 to 15 its
distribution over the types 10, 11, 20, 21, 22, 31, 32, 33]

pattern j    f_d(j)
 1           1.28611973
 2           1.28272221
 3           1.28466750
 4           1.28248358
 5           1.29339711
 6           1.29507819
 7           1.29507819
 8           1.30294154
 9           1.29080377
10           1.29080377
11           1.29080377
12           1.29749876
13           1.29749876
14           1.29749876
15           1.30300231

Table 3: f_d(j) for pattern j

Starting from the previous lemma, we now prove Theorem 3. We let θ := 0.5224565,
c* := 2- ≈ 0.4855942149, m* := log_6(1.32065) ≈ 0.155223982 (c* and m* are
rounded down). It is easily seen that the choice of c* and m* works for the bound we want
to achieve. Note that if we wanted more significant digits in the bound, we
would need to lower c* and m* slightly. As before, using the H from [9], we have now

a ≈ 0.99012456677,
b ≈ 7.85858019246,
β_H < 0.8180299645,
γ_H > 0.6083696059.

We now obtain a lemma analogous to Lemma 9 but with different β and γ. It
is straightforward to show, analogously to before, that we get the combined bound
of O(1.32065^n) for one combined execution by considering the combined worst case
A ≈ 0.0286138.
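The choice m* = log_6(1.32065) ties the preprocessing cost O(6^{m* n}) to the target bound, since 6^{m*} = 1.32065. The short check below, with hypothetical variable names, recomputes m* and confirms that the quoted nine-digit value reproduces the base 1.32065 up to rounding.

```python
import math

target_base = 1.32065          # base of the claimed running-time bound
m_star_quoted = 0.155223982    # value quoted in the text (rounded down)

# m* is defined as the base-6 logarithm of the target base.
m_star_exact = math.log(target_base, 6)

# Per-variable base of the preprocessing cost 6^(m* n) with the quoted m*.
preprocessing_base = 6 ** m_star_quoted

print(m_star_exact)
print(preprocessing_base)
```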